Building a Lexical Knowledge-Base of Near-Synonym Differences
Authors
Abstract
Diana Inkpen. Doctor of Philosophy, Graduate Department of Computer Science, University of Toronto, 2004.
Current natural language generation or machine translation systems cannot distinguish among near-synonyms, that is, words that share the same core meaning but vary in their lexical nuances. This is due to a lack of knowledge about differences between near-synonyms in existing computational lexical resources. The goal of this thesis is to automatically acquire a lexical knowledge-base of near-synonym differences (LKB of NS) from multiple sources, and to show how it can be used in a practical natural language processing system.
I designed a method to automatically acquire knowledge from dictionaries of near-synonym discrimination written for human readers. An unsupervised decision-list algorithm learns patterns and words for classes of distinctions. The patterns are learned automatically, followed by a manual validation step. The extraction of distinctions between near-synonyms is entirely automatic. The main types of distinctions are: stylistic (for example, inebriated is more formal than drunk), attitudinal (for example, skinny is more pejorative than slim), and denotational (for example, blunder implies accident and ignorance, while error does not).
I enriched the initial LKB of NS with information extracted from other sources. First, information about the senses of the near-synonym was added (WordNet senses). Second, knowledge about the collocational behaviour of the near-synonyms was acquired from free text. Collocations between a word and the near-synonyms in a dictionary entry were classified into: preferred collocations, less-preferred collocations, and anti-collocations. Third, knowledge
Similar resources
Building and Using a Lexical Knowledge Base of Near-Synonym Differences
Choosing the wrong word in a machine translation or natural language generation system can convey unwanted connotations, implications, or attitudes. The choice between near-synonyms such as error, mistake, slip, and blunder — words that share the same core meaning, but differ in their nuances — can be made only if knowledge about their differences is available. We present a method to automatica...
Acquiring Collocations For Lexical Choice Between Near-Synonyms
We extend a lexical knowledge-base of near-synonym differences with knowledge about their collocational behaviour. This type of knowledge is useful in the process of lexical choice between near-synonyms. We acquire collocations for the near-synonyms of interest from a corpus (only collocations with the appropriate sense and part-of-speech). For each word that collocates with a near-synonym we us...
Experiments on Extracting Knowledge from a Machine-Readable Dictionary of Synonym Differences (Invited Talk)
In machine translation and natural language generation, making the wrong word choice from a set of near-synonyms can be imprecise or awkward, or convey unwanted implications. Using Edmonds’s model of lexical knowledge to represent clusters of near-synonyms, our goal is to automatically derive a lexical knowledge-base from the Choose the Right Word dictionary of near-synonym discrimination. We d...
Experiments on Extracting Knowledge from a Machine-Readable Dictionary of Synonym Differences
In machine translation and natural language generation, making the wrong word choice from a set of near-synonyms can be imprecise or awkward, or convey unwanted implications. Using Edmonds’s model of lexical knowledge to represent clusters of near-synonyms, our goal is to automatically derive a lexical knowledge-base from the Choose the Right Word dictionary of near-synonym discrimination. We d...
‘Repetition’ in Arabic-English Translation: The case of Adrift on the Nile
This study investigates ‘repetition’ in the English translation of the Arabic novel Adrift on the Nile (1993). It aims to explore the communicative functions of ‘repetition’ and to see if these functions have been maintained or lost in the process of translating the novel. In addition, it seeks to find the translation strategies used in rendering ‘repetition’. To achieve this aim, a d...
Publication date: 2001